CBT Campus' Online Skills Training Courses.

Overview of Statistical Analysis and Modeling in R

Final Exam: Statistical Analysis and Modeling in R

Course Number:
it_fedawr_04_enus

Expected Duration (hours)
0.0

Lesson Objectives

Final Exam: Statistical Analysis and Modeling in R

analyze data that follows a uniform distribution
check the assumptions of the paired samples t-test
compare and contrast population metrics with sample metrics
construct hypothesis statements in the context of a statistical test
describe the bias-variance trade-off
estimate parameters of the population and interpret confidence intervals
examine and interpret the data for regression
examine and visualize data for regression
explore and pre-process data before model fitting
explore and visualize the relationships in data
find the optimal number of clusters using the elbow method and Silhouette score
fit and interpret the S-curve of logistic regression
fit a straight line on data to build a regression model and evaluate the model
implement the one-sample t-test and interpret results
interpret QQ plots for normally and non-normally distributed data
investigate and visualize data before fitting a model
outline the main characteristics of ensemble learning
perform regression using decision trees
perform regression using random forest
perform simple linear regression with a single predictor
perform the one-sample t-test and interpret results
perform the Wilcoxon signed-rank test
posit the null hypothesis and alternative hypothesis of a statistical test
recall characteristics of overfitted and underfitted models
recall implications of the p-value and significance level alpha
recall measures of central tendency and measures of dispersion
recall the assumptions made by the ANOVA test
recall the assumptions made by the one-sample t-test
recall the assumptions made by the two-sample t-test
recall the basic characteristics of machine learning models
recall the basic structure of decision tree models
recall the characteristics of discrete and continuous probability distributions
recall the key metrics to evaluate classifiers
recall the sets of statistical tools used to understand data
recall the techniques used to evaluate clustering models
sample and analyze data that follows a uniform distribution
summarize the differences and use cases for parametric and non-parametric models
train a model on an imbalanced dataset
train and evaluate a logistic regression model
use decision tree models for prediction

Overview/Description

Final Exam: Statistical Analysis and Modeling in R will test your knowledge and application of the topics presented throughout the Statistical Analysis and Modeling in R track of the Skillsoft Aspire Data Analysis with R Journey.

Target

Prerequisites: none

Statistical Analysis and Modeling in R: Building Regularized Models & Ensemble Models

Course Number:
it_dasamrdj_07_enus

Expected Duration (hours)
1.5

Lesson Objectives

Statistical Analysis and Modeling in R: Building Regularized Models & Ensemble Models

discover the key concepts covered in this course
recall characteristics of overfitted and underfitted models
describe the bias-variance trade-off
examine and interpret the data for regression
perform ordinary least squares (OLS) regression
prepare data to build regularized regression models
perform and evaluate Ridge regression
perform and evaluate Lasso regression
perform and evaluate ElasticNet regression
outline the main characteristics of ensemble learning
examine and visualize data for regression
perform regression using decision trees
perform regression using random forest
summarize the key concepts covered in this course

Overview/Description
Understanding the bias-variance trade-off allows data scientists to build generalizable models that perform well on test data. Machine learning models are considered a good fit if they can extract general patterns or dominant trends in the training data and use these to make predictions on unseen instances. Use this course to discover what it means for your model to be a good fit for the training data. Identify underfit and overfit models and what the bias-variance trade-off represents in machine learning. Mitigate overfitting on training data using regularized regression models, train and evaluate models built using ridge regression, lasso regression, and ElasticNet regression, and implement ensemble learning using the random forest model. When you're done with this course, you'll have the skills and knowledge to train models that learn general patterns using regularized models and ensemble learning.

Target

Prerequisites: none

Statistical Analysis and Modeling in R: Performing Classification

Course Number:
it_dasamrdj_05_enus

Expected Duration (hours)
1.6

Lesson Objectives

Statistical Analysis and Modeling in R: Performing Classification

discover the key concepts covered in this course
recall the key metrics to evaluate classifiers
fit and interpret the S-curve of logistic regression
train and evaluate a logistic regression model
train and evaluate a logistic model using all predictors
train a model on an imbalanced dataset
interpret the significance of coefficients, confidence intervals, and odds ratios
evaluate a model built using an imbalanced dataset
use resampling techniques to improve the model
recall the basic structure of decision tree models
explore and pre-process data before model fitting
use decision tree models for prediction
summarize the key concepts covered in this course

Overview/Description
Classification models are used to assign data points to two or more categories. Learn how these models work and how you can evaluate your classification models using the confusion matrix and metrics such as accuracy, precision, and recall. During this course, you'll perform classification using logistic regression, including on an imbalanced dataset. You'll also examine why precision or recall scores may be better metrics than accuracy for evaluating models trained on imbalanced data. Furthermore, build a classification model using decision trees, visualize the tree structure, and explore the variable importance assigned by this tree structure to understand and interpret the model. When you've finished this course, you'll be able to confidently use logistic regression and decision trees to build classification models and evaluate your models using accuracy, precision, and recall.
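The point about accuracy on imbalanced data can be illustrated with a small sketch. The course itself works in R; the plain Python below is only an illustration, and the function names and the made-up labels are assumptions, not course material. It scores a "classifier" that always predicts the majority class.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    tn = sum(1 for a, p in pairs if a != positive and p != positive)
    return tp, fp, fn, tn


def classification_metrics(actual, predicted, positive=1):
    """Compute accuracy, precision, and recall from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(actual, predicted, positive)
    accuracy = (tp + tn) / len(actual)
    # Guard against division by zero when no positives are predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Hypothetical imbalanced labels: 95 negatives and 5 positives.
actual = [0] * 95 + [1] * 5
# A trivial model that always predicts the majority (negative) class.
always_negative = [0] * 100

metrics = classification_metrics(actual, always_negative)
print(metrics)
```

The trivial model scores high on accuracy while its precision and recall for the minority class are both zero, which is why accuracy alone can mislead on imbalanced data.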

Target

Prerequisites: none

Statistical Analysis and Modeling in R: Performing Clustering

Course Number:
it_dasamrdj_06_enus

Expected Duration (hours)
0.8

Lesson Objectives

Statistical Analysis and Modeling in R: Performing Clustering

discover the key concepts covered in this course
recall the techniques used to evaluate clustering models
investigate and visualize data before fitting a model
perform k-means clustering and interpret clustering results
find the optimal number of clusters using the elbow method and Silhouette score
perform k-means clustering on multi-attribute data
summarize the key concepts covered in this course

Overview/Description
Clustering is an unsupervised learning technique that discovers patterns in data and helps identify logical groupings. Use this course to distinguish between supervised and unsupervised learning and recognize how regression and classification algorithms differ from clustering. Examine the basic principles of clustering models and how k-means clustering finds logical groupings in your data. Learn the evaluation techniques used in clustering and find the optimal number of clusters in your data using both the elbow method and the Silhouette score. Perform clustering on a dataset with multiple attributes and visualize clusters in your data using principal components. When you've completed this course, you'll be able to find groupings in your data using k-means clustering and compute the optimal number of clusters for your data.

Target

Prerequisites: none

Statistical Analysis and Modeling in R: Performing Regression Analysis

Course Number:
it_dasamrdj_04_enus

Expected Duration (hours)
1.0

Lesson Objectives

Statistical Analysis and Modeling in R: Performing Regression Analysis

discover the key concepts covered in this course
recall the basic characteristics of machine learning models
examine how to fit a straight line on data to build a regression model and evaluate the model
identify and visualize the relationships in data
perform simple linear regression with a single predictor
perform multiple regression using multiple predictors
apply the regression model to get predictions for test data
build a regression model using cross-validation
summarize the key concepts covered in this course

Overview/Description
Regression models are used to predict continuous values and are some of the most commonly used machine learning models. Use this course to grasp what exactly machine learning (ML) algorithms are and how you can use ML models to predict outcomes based on input data. Learn how regression models work, how to train them, and how to evaluate regression results using metrics such as R² and RMSE. Perform regression analysis in R using ordinary least squares regression. Build models using simple and multiple regression and train a regression model using cross-validation. Upon completing this course, you'll be able to perform regression to predict continuous values and evaluate these models using metrics such as R² and adjusted R².
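As a rough illustration of the evaluation metrics named above, the following sketch fits a straight line by ordinary least squares and then computes R-squared and RMSE. The course itself uses R; this plain Python version, its function names, and its data points are all made up for illustration.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(ys, preds):
    """Share of the variance in ys explained by the predictions."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def rmse(ys, preds):
    """Root mean squared error, in the same units as ys."""
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys))


# Hypothetical data lying close to a straight line.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(xs, ys)
preds = [slope * x + intercept for x in xs]
print(f"slope={slope:.3f} intercept={intercept:.3f}")
print(f"R^2={r_squared(ys, preds):.4f} RMSE={rmse(ys, preds):.4f}")
```

R-squared is unitless and approaches 1 for a good fit, while RMSE reports the typical prediction error in the units of the target variable, so the two are usually read together.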

Target

Prerequisites: none

Statistical Analysis and Modeling in R: Statistical Analysis on Your Data

Course Number:
it_dasamrdj_03_enus

Expected Duration (hours)
2.1

Lesson Objectives

Statistical Analysis and Modeling in R: Statistical Analysis on Your Data

discover the key concepts covered in this course
illustrate the assumptions made by the one-sample t-test
perform the one-sample t-test and interpret results
perform variations of the one-sample t-test, namely two-sided, greater, and less one-sample t-tests, and then interpret results
perform the one-sample Z-test and interpret results
illustrate the assumptions made by the two-sample t-test
run the two-sample t-test for equal variances
run Welch's two-sample t-test for unequal variances
perform the paired samples t-test
check the assumptions of the paired samples t-test for violation
perform the Wilcoxon signed-rank test
identify the assumptions made by the ANOVA test
run the one-way ANOVA test and the Tukey HSD test
run the two-way ANOVA test for additive and interaction models
summarize the differences and use cases for parametric and non-parametric models
summarize the key concepts covered in this course

Overview/Description
Hypothesis testing determines whether the educated guesses you've made about your data should be accepted or rejected. The t-test and ANOVA are among the most commonly used methods in hypothesis testing, and knowing how to perform and interpret these tests is a core data science skill. In this course, get hands-on experience running statistical tests on your sample data. Test assumptions made by statistical tests, run t-tests, perform ANOVA, and interpret the results. Perform the one-sample t-test and the one-sample Z-test. Run the two-sample t-test to compare data from different groups or categories and the paired samples t-test to compare data across time. When you're finished with this course, you'll have the know-how to run and interpret statistical tests using the R programming language.

Target

Prerequisites: none

Statistical Analysis and Modeling in R: Understanding & Interpreting Statistical Tests

Course Number:
it_dasamrdj_02_enus

Expected Duration (hours)
1.1

Lesson Objectives

Statistical Analysis and Modeling in R: Understanding & Interpreting Statistical Tests

discover the key concepts covered in this course
recall measures of central tendency and measures of dispersion
estimate parameters of the population and interpret confidence intervals
construct hypothesis statements in the context of a statistical test
posit the null hypothesis and alternative hypothesis of a statistical test
recall implications of the p-value and significance level alpha
interpret p-values using significance level alpha
recognize the use of t-tests to compare the means of two groups
explore the ANOVA (analysis of variance) test to compare the means of two or more groups
summarize the key concepts covered in this course

Overview/Description
Statistical analysis involves making educated guesses known as hypotheses and testing them to see if they hold up. Use this course to learn how to apply hypothesis testing to your data. Examine the use of descriptive statistics to summarize data and inferential statistics to draw conclusions. Learn how population parameters differ from summary statistics and how confidence intervals are used. Discover how to perform hypothesis testing on sample data, construct null and alternative hypotheses, and interpret the results of your statistical tests. Investigate the significance of the p-value of a statistical test and how it can be interpreted using the significance threshold or alpha level. Additionally, examine the most commonly used statistical tests, the t-test and the analysis of variance (ANOVA). When you're done, you'll have the confidence to set up the null and alternative hypotheses for your tests and interpret the results.
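The way a p-value is read against the significance level alpha can be sketched in a few lines. This example swaps in a two-sided one-sample Z-test (with an assumed known population standard deviation) rather than a t-test, because the normal distribution is available in Python's standard library. The course itself uses R, and the sample values, `mu0`, and `sigma` below are hypothetical.

```python
import math
from statistics import NormalDist, mean


def one_sample_z_test(sample, mu0, sigma):
    """Two-sided one-sample Z-test; sigma is the assumed known population SD."""
    n = len(sample)
    z = (mean(sample) - mu0) / (sigma / math.sqrt(n))
    # Probability of a statistic at least this extreme if H0 is true.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


def decide(p_value, alpha=0.05):
    """Compare the p-value with the significance level alpha."""
    return "reject H0" if p_value < alpha else "fail to reject H0"


# Hypothetical measurements; H0: population mean equals 5.0.
sample = [5.1, 4.9, 5.6, 5.8, 6.0, 5.7, 5.4, 5.9]
z, p = one_sample_z_test(sample, mu0=5.0, sigma=0.5)
print(f"z={z:.3f} p={p:.5f} -> {decide(p)}")
```

A p-value below alpha leads to rejecting the null hypothesis; otherwise the test fails to reject it, which is not the same as proving the null hypothesis true.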

Target

Prerequisites: none

Statistical Analysis and Modeling in R: Working with Probability Distributions

Course Number:
it_dasamrdj_01_enus

Expected Duration (hours)
1.6

Lesson Objectives

Statistical Analysis and Modeling in R: Working with Probability Distributions

discover the key concepts covered in this course
recall the sets of statistical tools used to understand data
compare and contrast population metrics with sample metrics
recall the characteristics of discrete and continuous probability distributions
sample and analyze data that follows a uniform distribution
sample and analyze data that follows a binomial distribution
calculate probabilities of events in the binomial distribution
sample and analyze data that follows a uniform distribution
examine and interpret normal distributions and exponential distributions
interpret QQ plots for normally and non-normally distributed data
use QQ plots to compare samples from different distributions
summarize the key concepts covered in this course

Overview/Description
Interpreting data is a core pre-processing step in data analysis and modeling. Use this course to practice using various dynamic statistical tools to explore and understand your data. During this course, you'll explore population distributions to model random variables, work with discrete and continuous probability distributions, and use discrete probability distribution types, such as the uniform, binomial, and Poisson distributions. You'll also examine continuous distributions, such as the normal and the exponential distributions. You'll round the course off by learning how to read and interpret QQ plots, which can be used to compare the distributions of two samples of data. When you're finished, you'll be able to use probability distributions to model events and understand your data.

Target

Prerequisites: none